---
title: "The Impact of Subway Systems on Air Pollution"
author: 'Shaohan Chang and Ruibo Sun'
abstract: This paper examines how subway system openings affect urban air pollution, with a focus on particulate concentrations. In cities with higher initial pollution levels, the opening of a subway system was associated with a 4% decrease in particulates in the area around the city center, and the decrease persisted over a four-year horizon. In heavily polluted cities, this reduction was projected to yield an annual external mortality benefit of about $1 billion. The findings imply that a significant portion of the cost of building subways may be offset by the reduced mortality from lower air pollution. The analysis can be repeated for other cities to learn more about the connection between air pollution and subway openings.
thanks: 'The code and data are available at https://github.com/ruibosun/paper2 and this paper replicates various aspects of https://doi.org/10.3386/w24183'
date: 02-20-2023
date-format: "MM/DD/YYYY"  
format: pdf
bibliography: "bib.bib"
---

# 1. Introduction
Air pollution is a significant contributor to disease and premature death, and it is widely recognized as the largest environmental risk factor to human health globally @a. Exposure to air pollution increases the risk of premature mortality from heart disease, stroke, and lung cancer. Advances in health and atmospheric sciences have made it possible to estimate the number of deaths and illnesses associated with air pollution, using information from the peer-reviewed scientific literature. The relationship between population-level pollution exposure, both short-term and long-term, and adverse health outcomes, including premature death and hospital visits, is expressed through the concentration-response function. According to @b, the Institute for Health Metrics and Evaluation, the Health Effects Institute, and the World Health Organization have developed estimates of air pollution-attributable deaths and other adverse health outcomes globally and for individual countries @c.

Globally, air pollution was responsible for 8.7% of deaths in 2017, making it the fifth leading mortality risk, with 4.9 million premature deaths worldwide. Canada is among the 10 countries with the lowest national PM2.5 exposure levels, according to analyses @d. In Canada, air pollution ranks as the 11th largest risk factor overall for premature death and disability, and it is the top environmental risk @e.

Therefore, it is crucial to reduce air pollution by introducing sustainable policies and methods. According to Li and Yin @f and Topalovic et al. @g, subways are the most effective mode of transportation for reducing congestion and greenhouse emissions due to their speed, efficiency, safety, and environmental benefits. Furthermore, as noted by Chen and Whalley, the operation of subways can lead to improvements in air quality, reduced mortality from air pollution, increased productivity, and reduced social costs @g.

The paper investigates the impact of subway systems on air pollution in cities around the world. It uses two main data sources: a description of the world's subway systems and a measure of airborne particulates, Aerosol Optical Depth (AOD), recorded by satellites between February 2000 and December 2017. The analysis compares changes in AOD within a city before and after the opening of a subway system to establish its causal effect. Most of the data and the basic analysis come from the paper "Subways and Urban Air Pollution" by Nicolas Gendron-Carrier et al. @citea.

The results show that the average effect of subway openings on AOD is a small decrease that cannot be distinguished from zero, but there is significant heterogeneity across cities. In the case of the 26 cities where AOD fell after the subway opened, the decrease was largest in cities whose initial level of AOD was above the median. The decrease in AOD levels was found to persist for at least 4 years. The results also indicate that subway ridership is a key factor in reducing AOD levels and that subway expansions beyond the initial line have small effects on AOD levels.

The study provides important information for policymakers considering the implementation of subway systems to mitigate air pollution. Based on the results, the authors estimate that a subway opening in an average city initially in the top half of the AOD distribution prevents 22.5 infant deaths and 500 total deaths per year, which is worth about \$43m and \$1b per year, respectively. The results suggest that subway systems may be cost-effective in reducing air pollution, particularly in cities with high initial levels of AOD. The study also sheds light on transportation behavior in developing countries, finding no evidence of differences between developing and developed world cities in their response to subways.
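
As a rough check on these figures, dividing each annual benefit by the corresponding number of deaths gives the value per death averted that the estimates imply. The sketch below uses only the rounded numbers quoted above, so the results are approximate, and the variable names are illustrative.

```r
# Rounded figures quoted in the text (per year, for an average high-AOD city).
infant_deaths_averted <- 22.5
total_deaths_averted  <- 500
infant_benefit_usd    <- 43e6  # about 43 million US dollars
total_benefit_usd     <- 1e9   # about 1 billion US dollars

# Implied value per death averted, in US dollars.
value_per_infant_death <- infant_benefit_usd / infant_deaths_averted
value_per_death        <- total_benefit_usd / total_deaths_averted

round(c(infant = value_per_infant_death, all = value_per_death))
```

Both ratios come out at roughly 1.9 to 2 million US dollars per death averted, so the two benefit figures are consistent with a similar valuation per life.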

The study investigates the effect of subways on urban air pollution using data from a panel of cities. The subway data, taken from Gonzalez-Navarro and Turner (2018) and updated to December 2017, define a subway as an electric-powered urban rail system isolated from interactions with vehicle and pedestrian traffic. The latitude, longitude, and date of opening of every subway station in the world were compiled manually between 2012 and 2014. The air pollution data are remotely sensed measures of suspended particulates from the Terra and Aqua satellites, which provide daily measures of the aerosol optical depth of the atmosphere at a 3 km spatial resolution from 2000 to 2017. The study considers the change in aerosol optical depth (AOD) in the period extending from 18 months before to 18 months after a subway opening, using a sample of 58 subway system openings between 2001 and 2016. The study also has ridership data for 42 of the 58 cities, with an average daily ridership of 130,000 people in the 18th month of operation. Ridership triples over the first three years of operation, and its growth begins to slow after three years. On average, construction began 77 months before the opening.

The study measures Aerosol Optical Depth (AOD) within 10 km of city centers using imagery from the Terra and Aqua satellites, and it evaluates the relationship between the remotely sensed AOD and ground-measured particulate matter (PM10 and PM2.5). The results show that AOD is a highly predictive measure of ground-measured particulate matter and that the relationship is not sensitive to the exact region used to calculate city-average AOD. The study used a monthly average of AOD readings within 10 km of the city center, calculated by averaging over all pixel-days of AOD readings that fall in this region during the month and weighting by the number of pixels observed each day. In 2017, the average AOD reading within 10 km of a city center from the Aqua satellite was 0.40; it was higher in Asian cities and lower in European and North American cities.
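
The pixel-weighted monthly average described above can be sketched as follows. The daily values are made up for illustration, and the sketch assumes that a daily mean AOD and a pixel count are available for each day.

```r
# Toy daily AOD readings for one city-month (illustrative values, not real data).
daily <- data.frame(
  day      = 1:4,
  mean_aod = c(0.30, 0.50, 0.40, 0.20),  # mean AOD over the pixels seen that day
  n_pixels = c(10, 30, 40, 20)           # number of pixels observed that day
)

# Averaging over all pixel-days equals a pixel-weighted mean of the daily means.
monthly_aod <- weighted.mean(daily$mean_aod, w = daily$n_pixels)
monthly_aod
```

With these toy values the monthly average is 0.38, which sits closer to the readings from days with more observed pixels than the unweighted mean of 0.35 does.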

## 1.1 Analysis Tool
The data are presented with tables and plots to make the relationship between air pollution and subways easier to understand. The analysis is carried out in R @R with the tidyverse @tidyverse and dplyr @dplyr packages. The tables are constructed with knitr @knitr, and the plots are created with ggplot2, which is part of the tidyverse @tidyverse. The paper is knitted to a PDF file with R Markdown @RMarkdown.

# 2. Data

All the data used for replication in this paper were collected by Nicolas Gendron-Carrier et al. Their study drew on multiple sources and employed various techniques to analyze the data and draw conclusions about the impact of subway systems on air pollution.

## 2.1 Pollution Measured by Satellite

The study uses satellite imagery to measure the concentration of particulates in the air. The Moderate Resolution Imaging Spectroradiometer (MODIS) instruments on NASA's Terra and Aqua Earth-observing satellites provide daily measures of the aerosol optical depth (AOD) of the atmosphere at a 3 km spatial resolution. The satellite data were then combined with ground-based measurements of particulate matter to create a comprehensive data set for analyzing the impact of subway systems on air pollution @z.

## 2.2 Subway Data

The study also collected data on the opening of subway systems in cities around the world, including the date of opening, the location, and the size of each subway system. The latitude, longitude, and date of opening of every subway station in the world were collected manually between January 2012 and February 2014, and the data were updated in 2020 using online sources and Google Maps. The data were used to construct a monthly panel that describes the count of operational stations in each subway city between February 2000 and December 2017, the period for which air pollution data were available.
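
A monthly panel of this kind can be built by counting, for each city and month, the stations that had opened by that month. The sketch below uses hypothetical cities and opening dates; only the February 2000 to December 2017 window comes from the text.

```r
# Hypothetical station opening dates (toy data, not from the original data set).
stations <- data.frame(
  city    = c("A", "A", "A", "B"),
  opening = as.Date(c("2005-03-15", "2005-03-15", "2008-07-01", "2010-11-20")),
  stringsAsFactors = FALSE
)
stations$open_month <- format(stations$opening, "%Y-%m")

# One row per city-month over the study window, with months as "YYYY-MM" labels.
month_ids <- format(
  seq(as.Date("2000-02-01"), as.Date("2017-12-01"), by = "month"),
  "%Y-%m"
)
panel <- expand.grid(city = unique(stations$city), month = month_ids,
                     stringsAsFactors = FALSE)

# Count the stations in each city that opened in or before each month.
panel$n_stations <- mapply(
  function(cty, m) sum(stations$city == cty & stations$open_month <= m),
  panel$city, panel$month,
  USE.NAMES = FALSE
)

head(panel[panel$city == "A" & panel$month >= "2005-02", ], 3)
```

Comparing "YYYY-MM" labels keeps the example free of end-of-month date arithmetic; a station counts as operational from the month in which it opens.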

To analyze the impact of subway systems on air pollution, the study used a difference-in-differences approach. This involved comparing the change in particulate concentrations in cities with newly opened subway systems to the change in particulate concentrations in cities without new subway systems. The study also compared the change in particulate concentrations in cities with high initial pollution levels to the change in particulate concentrations in cities with low initial pollution levels.
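
The comparison described above can be sketched as a two-by-two difference-in-differences calculation. This is a minimal illustration on made-up numbers, not the specification used in the original study; the data frame `cells` and its values are hypothetical.

```r
# Toy group-by-period mean AOD values (made-up numbers for illustration only).
cells <- data.frame(
  treated = c(1, 1, 0, 0),  # 1 = cities that opened a subway, 0 = comparison cities
  post    = c(0, 1, 0, 1),  # 0 = before the opening, 1 = after the opening
  aod     = c(0.50, 0.46, 0.40, 0.39)
)

# The coefficient on the interaction term is the difference-in-differences estimate.
fit <- lm(aod ~ treated * post, data = cells)
did <- unname(coef(fit)["treated:post"])
did
```

Here the estimate equals (0.46 - 0.50) - (0.39 - 0.40) = -0.03, the change in the treated cities net of the change in the comparison cities.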

## 2.3 Possible Confounding Variables

To account for other factors that could affect particulate concentrations, such as changes in weather or seasonal variations, the study used a variety of statistical techniques, including regression analysis and fixed effects models. The study also controlled for other variables that could affect air pollution, such as population density, income levels, and proximity to industrial areas.

# 3. Example in Chile

## 3.1 Ridership in Chile

This section examines the effect of subway systems on urban air pollution through an example. The underlying study has two main data sources: a description of the world's subway networks and a satellite measure of airborne particles, Aerosol Optical Depth (AOD), recorded between February 2000 and December 2017. To determine the causal relationship between the opening of a subway system and changes in AOD within a city, the analysis compares AOD before and after the opening. The ridership data set contains the following variables:

-   Country: the nation in which the city is situated (e.g., Chile)

-   Urban name: the name of the city or town (e.g., Valparaiso)

-   Year: the year of the observation (for yearly data) or the year in which the month falls (for monthly data)

-   Quarter: the quarter of the observation (for quarterly data)

-   Month: the month of the observation (for monthly data)

-   Ridership: the number of riders for the specified observation

-   Reference Period: the time frame used as a baseline for the observation (e.g., "Yearly" or "Monthly")

The monthly data contain one observation per month, whereas the annual data contain one observation per year. The monthly data appear to span 2000 to 2020.

The information in Table 1 is limited to the urban area surrounding Valparaiso and does not cover other parts of Chile. Furthermore, the data set offers no details on the underlying factors that might be responsible for changes in ridership over time. Ridership has increased substantially since 2005 and keeps rising each year, which indicates that more and more people have chosen to travel by subway.

```{r,warning = FALSE ,echo=FALSE,message=FALSE}
#| eval: true
#| warning: false
#| echo: false
#| label: tab-one
library(here)
library(dplyr)
library(knitr)

# Read the raw ridership data (absolute path on the author's machine)
subway_data_raw <- read.csv(file = '/Users/lucas/Documents/paper2/Subway_Ridership_V4.csv')

# Keep only the variables used in the tables and figures
subway_data <- subway_data_raw %>%
  select(Country, Urbanname, Year, Ridership, ReferencePeriod, Month)

# Drop Month for the preview table (named to avoid masking base::table)
ridership_table <- subway_data %>%
  select(Country, Urbanname, Year, Ridership, ReferencePeriod)

kable(head(ridership_table, 5), caption = "Subway City (First 5 rows)")
```

@fig-one shows ridership for Valparaiso, Chile, with the year on the x-axis, ridership on the y-axis, and the fill colour also denoting the year. The fill colour is a gradient from yellow to red, and the plot is titled "Ridership in Valparaiso, Chile". The `theme_classic()` and `theme()` functions are used to adjust the plot's theme and font characteristics. The x-axis, y-axis, and fill labels are set with the `labs()` function, and the gradient is set with the `scale_fill_gradient()` function. The ridership data are converted to numeric values with `as.numeric()` and rescaled to thousands by dividing by 1000. The figure presents the same information as Table 1, but the plot makes the trend in ridership easier to see than the numbers alone.

```{r ,fig.width= 5,fig.height= 3,warning = FALSE ,echo=FALSE,message=FALSE}
#| eval: true
#| warning: false
#| echo: false
#| label: fig-one
#| fig-cap: "Ridership in Valparaiso, Chile"

library(ggplot2)

# Plot the data using ggplot
ggplot(subway_data, aes(x = Year,  y = as.numeric(Ridership)/1000, fill = Year)) +
  geom_col() +
  labs(x = "Year", y = "Ridership (in thousands)", fill = "Year") +
  ggtitle("Ridership in Valparaiso, Chile") +
  scale_fill_gradient(low = "yellow", high = "red") +
  theme_classic() +
  theme(plot.title = element_text(hjust = 0.5, size = 14, face = "bold"),
        axis.text.x = element_text(size = 10),
        axis.text.y = element_text(size = 12),
        axis.title.x = element_text(size = 14, face = "bold"),
        axis.title.y = element_text(size = 14, face = "bold"),
        legend.title = element_text(size = 12, face = "bold"),
        legend.text = element_text(size = 12))

```

@fig-two shows the monthly ridership of the subway system in Santo Domingo, Dominican Republic. Month is mapped to the x aesthetic and ridership (in thousands) to the y aesthetic; because the plot uses `coord_flip()`, the months run along the vertical axis. Each red point represents the ridership for one month between February 2000 and February 2020. Apart from some seasonal variation, ridership has generally increased over time. There may also be a decline in ridership in the months after February 2020, although it is difficult to draw firm conclusions without more information.

```{r,warning = FALSE ,echo=FALSE,message=FALSE}
#| eval: true
#| warning: false
#| echo: false
#| label: fig-two
#| fig-cap: "Monthly Ridership in Santo Domingo, Dominican Republic"

library(dplyr)
library(ggplot2)
subway_data_monthly <- subway_data %>% 
  filter(ReferencePeriod != "Yearly")
ggplot(subway_data_monthly, aes(x = Month, y = as.numeric(Ridership)/1000, group = 1)) +
  coord_flip() +
  geom_point(color = "red") +
  labs(title = "Monthly Ridership in Santo Domingo, Dominican Republic",
       x = "Month", y = "Ridership (in thousands)",
       subtitle = "February 2000 - February 2020") +
  theme_minimal() +
  theme(plot.caption = element_text(size = 10))


```

```{r,warning = FALSE ,echo=FALSE,message=FALSE}
library(dplyr)
library(ggplot2)

# Filter the data to the desired year range and remove rows with missing values
year_range <- subway_data %>% 
  filter(Year >= 2000, Year <= 2010)

```



## 3.2 Subway in Chile
Table 2 reports ridership on the public transit system of Valparaiso, Chile. It gives the total ridership for each year from 2005 to 2009 rather than for the individual months or quarters in which the ridership was recorded, so the reference period is annual. To build the table, the data are filtered to records for Chile, and the Ridership column is converted to numeric format by removing the commas. Annual ridership increases every year; between 2005 and 2006 in particular, it rises by more than 7,000,000.
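
The comma-stripping and aggregation steps described above can be sketched on toy data. The data frame `toy` and its values are made up for illustration, and base R's `aggregate()` is used here in place of the `dplyr` pipeline.

```r
# Toy ridership records stored as text with thousands separators (made-up values).
toy <- data.frame(
  Country   = c("Chile", "Chile", "Chile", "China"),
  Year      = c(2005, 2005, 2006, 2005),
  Ridership = c("1,200,000", "300,000", "2,000,000", "9,000,000"),
  stringsAsFactors = FALSE
)

# Keep the records for Chile and convert Ridership to numeric by removing commas.
chile <- toy[toy$Country == "Chile", ]
chile$Ridership <- as.numeric(gsub(",", "", chile$Ridership))

# Sum ridership within each country-year.
totals <- aggregate(Ridership ~ Country + Year, data = chile, FUN = sum)
totals
```

Calling `as.numeric()` directly on a string such as "1,200,000" returns `NA`, which is why the commas are removed first.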

```{r,fig.width= 15,fig.height= 20,warning = FALSE ,echo=FALSE,message=FALSE}
#| eval: true
#| warning: false
#| echo: false
#| label: two

library(here)
library(dplyr)

new <- subway_data_monthly%>%filter(Country=="Chile")
#new$Ridership<-as.numeric(new$Ridership)
new$Ridership <- as.numeric(gsub(",", "", new$Ridership))

mytable<-new%>%group_by(Country,Year)%>%summarise(`Total Ridership`=sum(Ridership))


kable(mytable, caption = "Annual Ridership in Chile (2005-2009)")
```

Table 3 shows the annual ridership in China from 2005 to 2019 for comparison with Table 2. Ridership in China also shows an increasing pattern since 2005. However, there is a sharp decrease from 2016 to 2017, after which ridership gradually returns to its original level. Since China has a large population, its ridership is generally higher than that of Chile.

```{r,fig.width= 15,fig.height= 20,warning = FALSE ,echo=FALSE,message=FALSE}
#| eval: true
#| warning: false
#| echo: false
#| label: Tab-three
library(here)
library(dplyr)


new <- subway_data_monthly%>%filter(Country=="China")
#new$Ridership<-as.numeric(new$Ridership)
new$Ridership <- as.numeric(gsub(",", "", new$Ridership))

mytable<-new%>%group_by(Country,Year)%>%summarise(`Total Ridership`=sum(Ridership))


kable(mytable, caption = "Annual Ridership in China (2005-2019)")
```


## 3.3 PM10 in Different Countries

@fig-four shows the average annual PM10 values for each nation, expressed in micrograms per cubic metre. The data are organised by nation, and each country is represented by a bar of a different colour. The plot can be used to spot differences in air quality between nations. According to the plot, Afghanistan and Pakistan are the two countries that suffer the most from PM10, followed by Bahrain. Note that the data set does not cover every country.


```{r ,fig.width= 8,fig.height= 10,warning = FALSE ,echo=FALSE,message=FALSE}
#| eval: true
#| warning: false
#| echo: false
#| label: fig-four
#| fig-cap: "Average of PM10 Yearly"
library(readxl)
raw_pm10<-read_excel("/Users/lucas/Documents/paper2/aap_pm_database_may2014.xls", sheet="countries")
#raw_pm10<-read_excel("/Users/xiuzh/Desktop/readingcourse/paper2/aap_pm_database_may2014.xls", sheet="countries")
raw_pm10$`Annual mean, ug/m3...3`<-as.numeric(as.character(raw_pm10$`Annual mean, ug/m3...3`))
raw_pm10$Year...4<-as.numeric(as.character(raw_pm10$Year...4))
pm10<-raw_pm10%>%select(...2,`Annual mean, ug/m3...3`, Year...4)%>%filter(!is.na(...2))
colnames(pm10)<-c("Country", "PM10 Annual mean, ug/m3","Year")

#plot1
pm10 %>%
  ggplot(aes(x = Country, y = `PM10 Annual mean, ug/m3`, fill = Country)) +
    coord_flip() +
    geom_bar(stat = "identity", width = .90) +
    xlab("") + # Set axis labels
    ylab("") +
    guides(fill = "none") + # hide the legend; fill = FALSE is deprecated
    ggtitle("PM10 Annual mean, ug/m3") +
    theme_minimal()

```

@fig-five displays the annual mean of PM10 for each nation in each year. It is a scatter plot of the PM10 annual mean values by year in the pm10 data set, with two fitted lines. In the code, the pm10 data are specified first, and then the Year variable is mapped to the x-axis and the `PM10 Annual mean, ug/m3` variable to the y-axis.
 
```{r ,fig.width= 25,fig.height= 10,warning = FALSE ,echo=FALSE,message=FALSE}
#| eval: true
#| warning: false
#| echo: false
#| label: fig-five
#| fig-cap: "Change of PM10 Annual mean in different Year"

ggplot(pm10, aes(x=Year , y=`PM10 Annual mean, ug/m3`)) +
  geom_point() +
  geom_smooth(method = "lm", se = FALSE) +
  geom_smooth(method = "loess", se = FALSE, color = "red") +
  geom_text(aes(label = Country), hjust = -0.2, vjust = 1.2, size = 1.5)+
  theme_minimal()
```

```{r,warning = FALSE ,echo=FALSE,message=FALSE}
subway_city <- read.csv(file = '/Users/lucas/Documents/paper2/subway_date_construction_begins_V2_AEJ_revision.csv')
```


Two smoothed lines are added to the plot to highlight the pattern in the data and its overall shape. The first is a linear regression line, and the second is a locally smoothed LOESS curve.
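
To illustrate the difference between the two fits, the sketch below applies `lm()` and `loess()` to a small made-up series. The data frame `toy` and its values are hypothetical and are not taken from the PM10 data.

```r
# Hypothetical yearly series with a downward trend plus curvature (toy data).
toy <- data.frame(year = 2003:2012)
elapsed <- toy$year - 2003
toy$pm10 <- 80 - 2 * elapsed + 0.5 * (elapsed - 4.5)^2

# Straight-line fit: a single slope for the whole period.
linear_fit <- lm(pm10 ~ year, data = toy)
slope <- unname(coef(linear_fit)["year"])

# LOESS fit: local regressions that can follow the curvature.
loess_fit <- loess(pm10 ~ year, data = toy)
smoothed  <- predict(loess_fit)

data.frame(toy, linear = fitted(linear_fit), loess = smoothed)
```

The linear fit summarises the series with one slope (about -2 per year in this toy series), while the LOESS curve bends with the data, which is why the two lines can diverge in the figure.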


```{r,warning = FALSE ,echo=FALSE,message=FALSE}
new_subway_city <- subway_city |>
  select(-Links,-Notes) |>
  arrange(Opening.date)
```

Table 4 lists the opening and construction start dates of subway systems in various cities, along with the plan year and an unnamed variable X. Each row describes the subway project of one city. The opening date indicates the year the project was completed, the construction start date indicates the year building began, and the plan year is the year in which the project was originally planned.

```{r,warning = FALSE ,echo=FALSE,message=FALSE}
#| eval: true
#| warning: false
#| echo: false
#| label: Tab-four
library(knitr)
kable(head(new_subway_city,10), caption = "Subway_City (First 10 rows)")
```

# 4. Conclusion

This paper studies the impact of subway systems on air pollution in cities around the world. The study used two main data sources: a description of the world's subway systems and satellite measurements of the aerosol optical depth (AOD) of particulate matter in the air, recorded between February 2000 and December 2017. The analysis compared changes in AOD within a city before and after the opening of its subway system to establish a causal effect.

The findings of this study provide important information for policymakers considering a subway system as a way to mitigate air pollution. The authors estimate that opening a subway in an average city initially in the upper half of the AOD distribution could prevent 22.5 infant deaths and 500 total deaths per year, worth about \$43 million and \$1 billion per year, respectively. The results suggest that subway systems may be cost-effective in reducing air pollution, especially in cities with high initial levels of AOD @citea.

The study also sheds light on traffic behavior in developing nations, concluding that there are no differences between how developed and developing cities react to subways. This is a significant finding because developing nations may have a greater need for affordable air pollution solutions.

Remote sensing measurements of suspended particulate matter from the Terra and Aqua satellites served as the basis for the study's data. Using this satellite imagery, the study evaluated the correlation between remotely sensed AOD and ground-based measurements of particulate matter (PM10 and PM2.5) within 10 km of the city center. The findings demonstrate that AOD is a highly reliable predictor of ground-measured particulate matter and that this relationship is not sensitive to the precise area used to derive the city-average AOD. The study used monthly averages of AOD readings over all pixel-days within a 10-kilometer radius of the city center during the month, weighted by the number of pixels observed on each day.

A related study conducted in China reports results that match our findings, and it suggests two policy recommendations. Firstly, the government should actively promote green transportation as an alternative to motor vehicles. This can be achieved by increasing the availability of green vehicles and reducing the number of public motor vehicles to reduce emissions. Secondly, the development of subway systems should be planned rationally to minimize urban air pollution, particularly PM2.5. The government can increase the accessibility of subways by developing a dense subway network, which can help meet residents' travel demands and reduce emissions. However, extending existing subway lines may not be enough to change people's travel mode, so it may be more effective to increase the number of subway lines. Since subway projects are expensive, a rational plan for the subway network can help achieve the goal of reducing air pollution while also accounting for the fiscal investment required @g.

The study's use of changes in AOD levels to gauge the effect of the subway system on air pollution has some limitations. Although AOD is a trustworthy indicator of air pollution, it captures only one facet of the intricate issue of urban air pollution. Future studies should look into how subway systems affect other pollutants and overall air quality.

In conclusion, the studies covered in this article offer crucial details on how subway systems affect air pollution in urban areas around the globe. The findings imply that subway systems might reduce air pollution in cities with high initial levels of AOD at a reasonable cost. The findings of this study could be used by policymakers to decide whether to implement subway systems as a way to reduce air pollution. 

# Claims identified by reproducer

1. Subway systems can reduce air pollution: The study found that opening subways in an average city could prevent infant deaths and total deaths per year, suggesting that subway systems may be cost-effective in reducing air pollution, especially in cities with high initial levels of aerosol optical depth (AOD).

2. No difference in traffic behavior between developed and developing cities: The study concluded that there are no differences in the way developed and developing cities react to subways, which is a significant finding for developing nations that may have a greater need for affordable air pollution solutions.

3. AOD is a reliable predictor of ground-measured particulate matter: The study used satellite imagery and remote sensing measurements of suspended particulate matter to evaluate the correlation between remotely sensed AOD and ground-based measurements of particulate matter (pm10 and pm2.5). The findings demonstrated that AOD is a highly reliable predictor of ground-measured particulate matter.

\newpage

# References
